INTRODUCTION


In response to many inquiries from users of the Thesaurus of ASTIA
Descriptors and from others interested in the use of controlled
vocabularies for mechanized information retrieval, this paper is offered
as an outline of the general plan to be followed in the preparation of a
Second Edition of the ASTIA Thesaurus. The philosophies of the descriptor
and thesaurus approaches to information retrieval are discussed, with
particular emphasis on the relationships among descriptors. Although this
document was intended as a guideline for individuals who had been invited
to participate in the preparation of the Second Edition, the discussion
of the thesaurus philosophy is believed to be of general interest to
documentalists. Included is a bibliography which cites papers dealing
with the general concept of technical vocabularies.

PHILOSOPHY OF AND GUIDELINES FOR
REVISION OF THE ASTIA THESAURUS


I.  INTRODUCTION 

A. General Objectives

During the October 17-18, 1961, meeting of individuals and organizations
interested in revision of the Thesaurus of ASTIA Descriptors, an ad hoc
temporary committee on Thesaurus revision submitted a report (attached)
containing a number of suggestions which appeared to meet with the
approval of the assembled group. In accordance with these suggestions,
this outline of the philosophy of, and guidelines for, Thesaurus revision
is provided.

A major objective in revising the Thesaurus of ASTIA Descriptors is to
provide an improved ASTIA indexing authority in a form most useful
(1) to assist analysts in making consistent and sufficiently complete
assignment of descriptors to accessioned technical information and (2)
to assist bibliographers in making a corresponding consistent use of the
descriptors during the formulation of inquiries for mechanized retrieval.

A second major objective in revising the Thesaurus is to create a
device which will be as useful as possible to reference personnel in
organizations other than ASTIA. In this connection, ASTIA is anxious
during revision of the Thesaurus to have the cooperation and active
participation of all individuals and organizations who can assist in
making the Thesaurus more useful both to themselves and to ASTIA.

Edition II of the Thesaurus will incorporate the planned revisions.
In addition, ASTIA is investigating means of notifying the users of its
Thesaurus of subsequent changes, additions, and modifications to the
Thesaurus. A number of alternative methods of performing this
notification function are possible, and it is felt that this function
will be easily performed -- particularly because the rate of Thesaurus
modification in the future is expected to be low.

B. Definitions

To make meaningful the philosophy of ASTIA (as well as the herein
contained guidelines for Thesaurus revision), certain terms and ideas
must be defined.

1. Descriptors

Descriptors are controlled terms -- single words or phrases --
representing ideas or concepts. Descriptors are used to indicate the
subject matter content of documents and technical information in other
forms. Descriptors are to be distinguished from names of personal or
corporate authors, from expressions giving contract numbers, and from
other similar important kinds of access points of descriptive cataloging.
The word or phrase constituting each descriptor is chosen so that it will
possess the maximum suggestiveness and convenience in indicating the
descriptor's particular idea or concept to the technological or scientific
group concerned. For example, the expression biological stains is more
convenient to use than the inverted stains, biological in indicating the
concept of this class of stains and of their use. Because several different
functions are to be served by the descriptors, three broad types of
descriptors are employed in the ASTIA Thesaurus.

a. Type A Descriptors

Type A descriptors are controlled or standardized names of
subject-related sets of ideas or concepts. To describe them in another
way, Type A descriptors represent very broad or generic concepts. In
first approximation, they correspond to the names of the 292 descriptor
groups included in Edition I of the Thesaurus of ASTIA Descriptors; e.g.,
acoustic detection. One intended purpose of the Type A descriptor of the
revised Thesaurus is for broad classification of technical information
in a compatible manner that will facilitate communication and exchange
between information centers.

b. Type B Descriptors

Type B descriptors are the controlled and standardized
names of suitably chosen single ideas or concepts. Type B descriptors
are what are usually called merely "descriptors." They correspond to the
approximately 7,000 descriptors in Edition I of the Thesaurus; e.g.,
sonar receivers. Some of these Type B descriptors may become Type A
descriptors as a result of the current revision effort.

c. Type C Descriptors

Type C descriptors are terms extracted from the information
being indexed to delineate information content not dealt with by Type A
or B descriptors. Therefore, Type C descriptors are not completely
controlled or standardized. The Type C descriptor terms must be specific
in meaning. Ordinarily they will consist of proper or code names of
equipment or projects, or will be important but infrequently used or
parochial terminology; e.g., AN/BQQ-1. In ASTIA parlance, these terms
are called identifiers (formerly known as "open-ended terms"). Type C
descriptors provide additional and important points of access to ASTIA's
document collection.

d. General Discussion

In analysis, descriptors of Type A and B are associated
with each document (or other form of technical information) in order
to delineate its subject matter. Type C descriptors are used as needed.
Thus the document is delineated by a set of descriptors. To the extent
that it is practicable, each applicable and relevant descriptor of
Type A and B in the Thesaurus is used to delineate any given document.
In a fashion, the Thesaurus is used as a check list against the subject
content.


An information search is prescribed by the formation of
a small set of descriptors each of which is believed to be in the
delineating set of the desired information. In the ideal case, selection
occurs when a single small prescribing set is included in this delineating
set. However, generally it will be necessary to use several prescribing
sets to give the full range of selection needed.
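The selection rule just described (a document is selected when a
prescribing set is included in its delineating set) can be sketched as
ordinary set inclusion. This is only an illustration under stated
assumptions: the document numbers and the descriptor assignments below
are hypothetical, and the function name search is not taken from any
ASTIA system.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical index: document number -> delineating set of descriptors.
INDEX: Dict[str, FrozenSet[str]] = {
    "DOC-1": frozenset({"sonar receivers", "acoustic detection"}),
    "DOC-2": frozenset({"masers", "microwave amplifiers"}),
    "DOC-3": frozenset({"sonar receivers", "amplifiers"}),
}


def search(index: Dict[str, FrozenSet[str]],
           prescribing_sets: Iterable[Set[str]]) -> List[str]:
    """Select documents whose delineating set includes at least one
    of the prescribing sets (set inclusion, tried once per set)."""
    wanted = [frozenset(p) for p in prescribing_sets]
    return sorted(
        doc
        for doc, delineating in index.items()
        if any(p <= delineating for p in wanted)
    )


# Ideal case: one prescribing set, one inclusion test per document.
single = search(INDEX, [{"sonar receivers", "acoustic detection"}])

# General case: several prescribing sets widen the selection.
several = search(INDEX, [{"masers"}, {"sonar receivers"}])
```

Passing several prescribing sets corresponds to the general case in which
one set alone does not give the full range of selection needed.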

One measure of the effectiveness of the revision of the
ASTIA Thesaurus will be how closely it is possible to approach the ideal
of a single prescribing set and a single inclusion for the search and
selection of technical information, and [...] independent of
the viewpoint of individual documentalists.

2.  Relationships 

The description of documents for effective retrieval is a
communication process. An understanding of communications depends not
only upon the terminology (i.e., descriptors) employed but also upon the
context of that terminology as well as the meaning inferred by the
recipient in the communication pattern. Context involves relationships
among descriptor meanings -- and there exist several different kinds of
relationships, which are discussed under Part II-A of this paper. If
relationships among descriptors are not specified in a retrieval system,
confusion as to descriptor meaning may develop during both input (analysis
and indexing) and output (retrieval). On the other hand, the specification
of relationships among descriptors enables consistent and sufficiently
extensive use of the vocabulary.

3. Thesaurus

The ASTIA Thesaurus is an authoritative and structured reference
to the ASTIA vocabulary of descriptors. As such, it exhibits the
relationships among descriptors and their relationships to words in
ordinary language, and clearly defines what sorts of relations exist among
specific descriptors. This, in effect, assists greatly in defining each
descriptor by relating it in specified fashions to other descriptors as
well as to groups of descriptors and to common terminology.

As such, the Thesaurus of ASTIA Descriptors constitutes the
basic "tool" by means of which ASTIA's objectives (of providing an
authoritative vocabulary for consistent and extensive use by analysts
and bibliographers) may be achieved with a high degree of simplicity and
validity. This is the fundamental expression of ASTIA's philosophy.

C. Philosophy

The philosophy employed in constructing a vocabulary for ASTIA can
be described from three viewpoints.

1. The Controlled Vocabulary


The vocabulary must be a controlled vocabulary. By "controlled"
is meant that an authoritative and definitive reference is provided
(i.e., the Thesaurus) both to descriptors and to relationships among
descriptor meanings -- yet access to the vocabulary is possible from
multiple viewpoints. The Thesaurus must also provide an authoritative
guide from ordinary technical or scientific word usage to the controlled
and standardized vocabulary of descriptors. Flexibility must be maintained
(but disorder not permitted) with reference to the addition or deletion of
descriptors as well as to addition or deletion of relationships among
descriptors.

2. Competent Collection Coverage of the Vocabulary

The vocabulary must be competent to deal with the actual retrieval
problems represented by the range, size, and depth of ASTIA's technical
information collection. It must be useful in the processing of information
and inquiries received by ASTIA. It must be expected to encompass only
those technologies (or the terminologies thereof) encountered in the ASTIA
collection. Yet, insofar as possible, it must be useful to other
organizations dealing with collections dissimilar to ASTIA's.

3. Compatibility of the Vocabulary

Thus, the vocabulary should be as compatible as possible with
other similarly-used vocabularies -- and the Thesaurus, as the principal
means for achieving such compatibility, should make it possible for other
organizations to "translate" their vocabulary to or from that of ASTIA
and for ASTIA to do the same with other vocabularies. In this respect,
the assistance of organizations other than ASTIA will prove invaluable.

II. GUIDELINES


A. Interdescriptor Relationships

The attainment of consistent and sufficiently extensive use of the
ASTIA vocabulary during either input (analysis and indexing) or output
(retrieval) operation depends upon overcoming three basic communications
problems, listed here in increasing order of difficulty.

1. The Semantic Problem

This is the problem which may be narrowly defined as that of
the meanings of words -- specifically, the relationship between the
mental concept and the symbol which stands for that concept. In the
following, a distinction will be made between words or phrases in ordinary
languages, which will be called terms, and the controlled and standardized
expressions which we have been calling descriptors. In this narrow
sense, there are three aspects to the semantic problem.


a. Homographs


Homographs are words which are spelled the same but which
mean different things -- things not at all related, e.g., perch (bird
roost) and perch (fish), tank (vehicle) and tank (container), lead (metal)
and lead (electronic wiring component), etc. Such concepts must be
distinguished one from the other or else consistency in document
description and retrieval cannot be achieved.

b. Near-Synonyms

Depending upon viewpoint (see below) many terms may be
synonymous or not. Some may even be synonyms from one viewpoint and
antonyms from another; e.g., salvage (reclaiming) and recovery (reclaiming)
vs. salvage (disposal) and recovery (reclaiming). The viewpoints used in
defining descriptors for these concepts (e.g., salvage in the above
example) must be made clear if consistency in document indexing and
retrieval is to be achieved.

c. Synonyms

Cross references must be established for those terms which
in ASTIA's environment are sufficiently near in meaning to descriptors
such that item numbers are not sometimes posted to one descriptor and
sometimes to another. However, care must be taken to insure that such
definitions of synonymy are not made so broad that the fine detail of
description is lost.

2. The Generic Problem

The generic problem involves the existence of "family trees"
of concepts -- i.e., the broadness or narrowness of viewpoint brought
to bear on a given concept. Terms standing for very narrow viewpoints
of a concept tend to be Type C descriptors or identifiers (e.g., F4U,
Minuteman, etc.). They will be very numerous but so specific that
their utility is limited in a descriptor Thesaurus (as distinguished
from their utility in retrieval).

However, there should exist another Thesaurus wherein these
identifiers are referenced to the most specific or lowest generically
related descriptors included in the descriptor Thesaurus -- e.g., F-106
(jet fighter). Identifiers must be cross-referenced among themselves
to prevent confusion in and duplication of terminology; spelling must
be standardized. Because identifiers are not under ASTIA internal
control, full completeness and consistency cannot be expected at any
stage.


Descriptors standing for broader viewpoints of a concept will
be included in the descriptor Thesaurus. Each such descriptor will,
when considered from any one viewpoint, be one member of a "generic
tree". Consider, for example, the substance sodium chloride from the
chemical structure viewpoint. Salts include halides, sulfides, etc.;
halides include chlorides, bromides, fluorides and iodides; chlorides
include sodium chloride, aluminum chloride, etc. Here the term sodium
chloride is a figurative leaf on the salts "generic tree" -- but this is
true when it is considered from the chemical structure viewpoint. The
same substance, when considered from the food viewpoint, would be a
member of a "generic tree" containing the term seasoning agents. When
considered from the refrigerant viewpoint, sodium chloride might even be
generic to brine.
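A minimal sketch of the sodium chloride example, assuming each viewpoint
is held as a separate tree mapping a broader term to its narrower terms.
The dictionary layout and the function name broader_terms are
illustrative assumptions; only the terms themselves come from the text.

```python
from typing import Dict, List

# One "generic tree" per viewpoint: broader term -> narrower terms.
TREES: Dict[str, Dict[str, List[str]]] = {
    "chemical structure": {
        "salts": ["halides", "sulfides"],
        "halides": ["chlorides", "bromides", "fluorides", "iodides"],
        "chlorides": ["sodium chloride", "aluminum chloride"],
    },
    "food": {
        "seasoning agents": ["sodium chloride"],
    },
    "refrigerant": {
        "sodium chloride": ["brine"],
    },
}


def broader_terms(tree: Dict[str, List[str]], term: str) -> List[str]:
    """Walk upward from a term to the root of one viewpoint's tree,
    returning the broader terms from nearest to most general."""
    parent_of = {
        child: parent
        for parent, children in tree.items()
        for child in children
    }
    chain: List[str] = []
    while term in parent_of:
        term = parent_of[term]
        chain.append(term)
    return chain


# The same term sits in a different family under each viewpoint.
by_viewpoint = {
    viewpoint: broader_terms(tree, "sodium chloride")
    for viewpoint, tree in TREES.items()
}
```

Under the refrigerant viewpoint the chain is empty because sodium
chloride is itself the broader term there, which is the point of the
example: the position of a term depends on the viewpoint chosen.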

Each different viewpoint of the same concept will result in the
concept being a member of a different generic family. Sodium chloride,
for example, cannot always be considered as a seasoning agent, nor (for
that matter) always as a refrigerant, an industrial raw material, a
herbicide, etc. Rather, these are concepts which may be related to
sodium chloride -- sometimes on the same generic level (i.e., nearly
synonymous) and sometimes on different generic levels (i.e., members of
the same generic family). Thus, in most instances, generic relationships
cannot be specified among descriptors; variations in viewpoint make these
relationships too transitory.

When a firm generic relationship exists among terms, that
relationship must be exhibited in the Thesaurus; otherwise attempts at
retrieval based upon either a broader or narrower consideration of the
same viewpoint will fail. However, even though a firm generic
relationship cannot be specified among certain descriptors, the possible
existence of one must be exhibited in order to permit indexing and/or
retrieval as necessary from various viewpoints.

3. Viewpoint Problem

This, the most difficult of the three basic problems, exhibits
itself as facets of the semantic and generic problems of descriptors as
described above. Thus the basic problems and their interrelationships
can be diagrammed as follows:


                     Degree of Variation of Viewpoint

           Variations too        Variations sufficiently  Variations so marked
           frequent to permit    infrequent, thus         (or so limited) as
           specifying a          permitting specifying    to make confusion
           relationship          a defined relationship   unlikely

Semantic   (1) Near-synonyms     (2) Synonyms or almost   (3) Homographs
Aspects    or partial overlaps   complete overlap         (marked variation
                                                          in viewpoint)

Generic    (4) Possible generic  (5) Defined generic or   (6) Identifiers
Aspects    or inclusion          inclusion                (limited variation
           relationships:        relationships:           in viewpoint)
           (a) up  (b) down      (a) up  (b) down

B. Plan of Attack Upon Interdescriptor Relationship Problems


The above diagram thus defines six specific interdescriptor
relationship problems, and the plan of attack upon each of these is set
forth below.


1. Near Synonyms

Here there is a definite relationship between descriptors. The
idea, concept, or meaning of one descriptor partially overlaps that of
another descriptor, e.g., disposal, recovery, and salvage. Thus, for at
least part of these meanings, we have different words for the same thing.
The relationship in this example is only sometimes one of synonymy, but
often is not, depending upon the variations in viewpoint. Thus the
Thesaurus should indicate that there is a relationship or partial
overlapping between certain of the descriptors, although the exact form of
that relationship (or even its existence at all from some viewpoints)
cannot always be specified.


The "Also See" reference is indicated in this circumstance. It
must be from descriptor to descriptor. However, it must be recognized
that the following exemplary condition may prevail:


Generators has "Also See" reference to motors.

Motors has "Also See" references to generators and to drives.

Drives has "Also See" reference to motors.

There may be no "Also See" reference between generators and
drives because they may not be inherently related, although both may be
related (from different viewpoints) to motors. Thus, one cannot expect
all "Also See" references to be commutative; e.g., the "Also See"
references of generators will not match exactly those of motors.
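The generators, motors, and drives condition can be checked mechanically.
The sketch below stores the three "Also See" entries exactly as listed and
assumes nothing beyond them; the table name ALSO_SEE and the helper
also_see are illustrative. In present-day terms, each listed pair is
reciprocal, but the relation does not carry over from generators through
motors to drives, which is the behavior the text calls noncommutative.

```python
from typing import Dict, Set

# "Also See" references exactly as given in the example above.
ALSO_SEE: Dict[str, Set[str]] = {
    "generators": {"motors"},
    "motors": {"generators", "drives"},
    "drives": {"motors"},
}


def also_see(source: str, target: str) -> bool:
    """True when the entry for source carries an Also See to target."""
    return target in ALSO_SEE.get(source, set())


# Every listed reference has a matching reference in the other direction.
reciprocal = all(
    also_see(target, source)
    for source, targets in ALSO_SEE.items()
    for target in targets
)

# Generators and drives are both tied to motors but not to each other.
linked_through_motors = (
    also_see("generators", "motors") and also_see("motors", "drives")
)
direct_link = also_see("generators", "drives")

# The reference lists of two related descriptors need not match.
lists_match = ALSO_SEE["generators"] == ALSO_SEE["motors"]
```

A searcher who wants everything near generators therefore has to follow
references one step at a time and judge each step from the viewpoint in
hand, rather than expanding the chain automatically.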


2. Synonyms

Here, variations among viewpoints (in the ASTIA environment)
of two or more terms are adjudged to be so infrequent or so minor, and
the difference in generic level is so minor, that a relationship of
synonymy can easily be specified. Care must be taken not to specify
synonymy when variations in viewpoint are so frequently encountered, or
are so marked, as to make the specification untenable. The most
frequently encountered "synonymous" descriptor should be used as the
descriptor referred to from the "synonymous" terms used less frequently.
The "Use" reference is indicated in this circumstance.

Terms affixed with "Use" references should definitely be
inserted in their proper alphabetical order in the "Scope Note Index"
(or its equivalent) of the Thesaurus -- as is done at present. If a
term is a synonym (from two or more viewpoints) of more than one other
descriptor, there is nothing wrong with a reference such as "Use
Descriptor A or Descriptor B."


For purposes of future updating, it is advisable to provide
"Includes" references, which would be affixed to descriptors "Used" in
lieu of other terms; e.g., induction heating will have an "Includes"
reference for every term which is referenced "Use induction heating."
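Since an "Includes" reference is simply the inverse of a "Use" reference,
one can be derived from the other. The sketch below shows this inversion;
the entry terms pointing at induction heating, and the terms "term x",
"descriptor a", and "descriptor b", are hypothetical placeholders and do
not come from the ASTIA vocabulary.

```python
from collections import defaultdict
from typing import Dict, List

# "Use" references: non-preferred term -> descriptor(s) to use instead.
# All entry terms here are hypothetical examples.
USE: Dict[str, List[str]] = {
    "eddy current heating": ["induction heating"],
    "radio frequency heating": ["induction heating"],
    # A term that is a synonym of two descriptors from two viewpoints:
    # "Use Descriptor A or Descriptor B."
    "term x": ["descriptor a", "descriptor b"],
}


def build_includes(use: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the Use references so that each descriptor lists every
    term that is referenced to it."""
    includes: Dict[str, List[str]] = defaultdict(list)
    for term, descriptors in use.items():
        for descriptor in descriptors:
            includes[descriptor].append(term)
    return {d: sorted(terms) for d, terms in includes.items()}


INCLUDES = build_includes(USE)
```

Generating the "Includes" side from the "Use" side, rather than keeping
both by hand, is one way to keep the two in step when the Thesaurus is
updated.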

3. Homographs

Treatment  of  these  descriptors  is  simple,  requiring  only  a 
"scope  note"  (such  as  the  present  parenthetical  Descriptor  Group  name  or 
other  types  of  defining  phrases). 

4. Possible Generic Relationships

The same comments apply here as to the "near synonym"
relationships (see above), except that variations in viewpoint affect the
existence or absence of generic relationships rather than that of synonymy.

Here, too, use of the "Also See" reference is indicated, just
as for the relationship of "near synonymy." The comment about
noncommutativeness of the "Also See" references also applies here.

5.  Defined  Generic  Relationships 

Here, variations among viewpoints (in the ASTIA environment) of
two or more descriptors may be adjudged to be so infrequent or so minor,
while at the same time the difference in generic level is significant,
that a generic relationship may be specified. Care must be taken not to
specify a generic relationship when variations in viewpoint are so
frequently encountered or are so marked as to make the specification
untenable.


ASTIA's philosophy of, and guidelines for, the treatment of
generic relationships are as follows:

In terms of the known state-of-the-art, there does not exist a
field-tested, automated system which solves the problems of indicating
unambiguously vertical relationships for a multidiscipline library.
This fact, true in May 1960 and true for November 1, 1961, explains why
ASTIA exhibited no such relationships in the first edition of its Thesaurus.
Because ASTIA's investment and expanding role in the better utilization
of American scientific and technological know-how cannot be jeopardized,
no presently known scheme, no matter how attractively or logically
argued on paper, can be supported at this time.

Whatever the final disposition of the generic problem, any
solution must be based on a controlled vocabulary. "Control" not only
includes the authorization of a term as a descriptor and the definition
of that term but also encompasses relationships among descriptors.

ASTIA proposes a system of indicating generic relationships
which essentially treats of generic relationships among Type B descriptors.
In addition, ASTIA proposes to develop further techniques in generic
indexing by prescribing relationships and usage between Type B and
Type A descriptors.

Typically, a hierarchy will be created when the relation of
generification is specified; e.g., all masers are microwave amplifiers,
all microwave amplifiers are amplifiers, all amplifiers are electronic
equipment, etc.

Use of the "Generic To" reference is indicated in this
circumstance. For example, the descriptor microwave amplifiers would be
referenced "Generic To masers" (as well as other types of microwave
amplifiers covered by the vocabulary). The descriptor amplifiers
would be referenced "Generic To microwave amplifiers, etc." (where "etc."
refers to other types of amplifiers than microwave amplifiers as well as
to specific kinds of both the other types of amplifiers and of microwave
amplifiers). The descriptor electronic equipment would be referenced
"Generic To amplifiers, microwave amplifiers, masers, etc." (where the
"etc." includes other types of electronic equipment as well as all that
was included by amplifiers).

In order to improve the thesaurus as a vocabulary reference
tool, the standard dictionary practice of indicating the higher generic
references is also recommended. For example, masers would be referenced
"Add microwave amplifiers, amplifiers, electronic equipment." The
descriptor microwave amplifiers would be referenced "Add amplifiers,
electronic equipment." The descriptor amplifiers would be referenced
"Add electronic equipment."
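Both kinds of reference in the maser example follow from the immediate
broader-to-narrower links alone: "Generic To" lists everything below a
descriptor, and "Add" lists everything above it. The sketch below assumes
a single chain with one broader descriptor per term, as in the example;
the names NARROWER, generic_to, and add_refs are illustrative.

```python
from typing import Dict, List

# Immediate generic links only: broader descriptor -> narrower ones.
NARROWER: Dict[str, List[str]] = {
    "electronic equipment": ["amplifiers"],
    "amplifiers": ["microwave amplifiers"],
    "microwave amplifiers": ["masers"],
}


def generic_to(term: str) -> List[str]:
    """All descriptors below a term, at every level (Generic To)."""
    found: List[str] = []
    for child in NARROWER.get(term, []):
        found.append(child)
        found.extend(generic_to(child))
    return found


def add_refs(term: str) -> List[str]:
    """All descriptors above a term, nearest first (Add).
    Assumes each descriptor has at most one broader descriptor."""
    parent_of = {
        child: parent
        for parent, children in NARROWER.items()
        for child in children
    }
    found: List[str] = []
    while term in parent_of:
        term = parent_of[term]
        found.append(term)
    return found


down = generic_to("electronic equipment")
up = add_refs("masers")
```

An analyst indexing a document on masers would take the "Add" list as the
broader descriptors to post as well, while a searcher starting from
electronic equipment would take the "Generic To" list as the narrower
descriptors to consider.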


6.  Identifiers 


While these terms should not be part of the descriptor Thesaurus,
each of them should be "tagged" with a descriptor, thus creating (in
effect) an identifier Thesaurus. The "tags" should consist of higher
generic levels of the concepts symbolized by the identifiers.

C. Procedures and Criteria for Selection and Deletion of Descriptors

Concepts to be expressed by Type B descriptors are selected (1) from
accessioned technical information, (2) from bibliographic requests, and
(3) by refinement of Type B descriptors which have been used frequently
in processing information or requests.

In  the  first  case,  novel  concepts  which  are  thought  to  be  candidates 
for  Type  B  descriptors  may  be  extracted  from  current  documents  and  assigned 
as  identifiers  in  order  to  determine  their  frequency  of  appearance  (and 
corresponding  utility  as  Type  B  descriptors)  and  to  record  the  document 
numbers  involved  for  updating  of  the  retrieval  tapes  if  the  concept  is 
subsequently incorporated into the Thesaurus.

In the second case, concepts which have not previously been recognized
by assignment of Type B descriptors or identifiers may be revealed by
users' questions and can be added when the pertinent documents are
identified.


In the third case, Type B descriptors which are quite frequently used
indicate (to some extent) concepts which may not be specific enough for
efficient retrieval. Statistical studies of the assignment to documents
of such descriptors are now made periodically to indicate which of them
should be considered for refinement.


Suggestions for new descriptors (originating from all three of the
aforementioned sources) are now evaluated in view of the logical, generic,
and syntactical relationships to other descriptors; in view of definitions
and the authority therefor; in view of the frequency with which the concept
has appeared in the collection to date; and in view of the utility of the
term in processing bibliographic requests. Decisions as to the
descriptor terminology to be employed are based on the usage in textbooks,
dictionaries, and other authoritative sources, as well as that found in
the ASTIA collection. A descriptor proposal form (Attachment II) has
been used with considerable success within ASTIA for evaluating descriptor
suggestions.


Those descriptors which (1) experience has indicated to be too
specific for efficient retrieval, (2) represent outmoded terminology,
or (3) have been used very infrequently in processing current information
and requests are candidates for deletion from the Thesaurus.


III. NON-THESAURUS CONSIDERATIONS


A. Names of Chemical Compounds

Although according to the previous discussion the names of chemical
compounds might be treated as identifiers, it may happen that certain
names of specific compounds should not be included even in the identifier
Thesaurus. They should, of course, be "tagged" with the names of their
"chemical families," which should be Thesaurus descriptors. Because a
chemical compound will usually belong to more than one chemical family,
names of chemical compounds may thus turn out to be "exceptional" sorts
of identifiers. This "exception" situation indicates that a different
(possibly nonthesaurus) approach should be taken to the indexing of
chemical compounds generally -- and this should be the object of a separate
study.


B.  Syntactical  Problems 

Only three basic problems (viewpoint, generic, and semantic) are
discussed above. There is a fourth problem, that of syntax, which is
relatively independent of the Thesaurus. Whether or not ASTIA should
place syntactical constraints upon the descriptors is something that
should be considered entirely apart from its studies of Thesaurus
revision at this time; any reasonable system of syntactical constraints
will be compatible with any operationally successful Thesaurus.

Syntactical constraints would be employed principally to prevent
"false drops" via preventing the invalid coordination of descriptors
during retrieval. They can, however, also serve a useful purpose by
making it possible to provide (in response to a search) not only a set
of citation numbers ("addresses" of retrieved information) but also,
for each citation, the descriptors associated with the information
listed in ordered sequence.

Most frequently, role indicators are used as syntactical constraints,
although (when average depth of indexing exceeds 30 to 40 descriptors per
document) association links may be used as well. Role indicators provide
clues as to the role a descriptor plays in the given document (e.g.,
raw material, production of, design of, research on, etc.); as such
they enable the listing of descriptors in "ordered context." Association
links are employed when the document being indexed is so complex that
it must be indexed as if it were more than one document.
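
The effect of these two constraints on a coordinate search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the document number, the descriptors, the role labels, and the function names are all hypothetical, and the sketch does not represent any ASTIA procedure. It shows a plain descriptor coordination producing a false drop that the role and link constraints suppress.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    descriptor: str
    role: str  # hypothetical role indicator, e.g. "raw material"
    link: int  # association link: which "sub-document" the posting belongs to


# Hypothetical index: one complex document indexed as two linked sub-documents.
INDEX = {
    "AD-0001": [
        Posting("ALUMINUM", "raw material", 1),
        Posting("ALLOYS", "production of", 1),
        Posting("COPPER", "research on", 2),
        Posting("CORROSION", "research on", 2),
    ],
}


def plain_search(index, terms):
    """Coordinate descriptors only (logical AND), ignoring roles and links."""
    wanted = set(terms)
    return [doc for doc, posts in index.items()
            if wanted <= {p.descriptor for p in posts}]


def constrained_search(index, query):
    """query maps descriptor -> required role.

    A document is retrieved only if every (descriptor, role) pair
    occurs under one and the same association link.
    """
    hits = []
    for doc, posts in index.items():
        for link in sorted({p.link for p in posts}):
            have = {(p.descriptor, p.role) for p in posts if p.link == link}
            if all((d, r) in have for d, r in query.items()):
                hits.append(doc)
                break
    return hits


# ALUMINUM and CORROSION both occur in AD-0001, but in unrelated parts of it.
false_drop = plain_search(INDEX, ["ALUMINUM", "CORROSION"])
no_drop = constrained_search(
    INDEX, {"ALUMINUM": "raw material", "CORROSION": "research on"})
```

In this sketch `false_drop` contains the document while `no_drop` is empty, because the two descriptors never share an association link.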

Role indicators must be few in number and (insofar as possible)
mutually exclusive and collectively exhaustive. The design of a good
set of role indicators is a major intellectual and experimental task not
to be lightly undertaken. On the one hand, role indicators may be implicit
anyway in some standard descriptor systems. On the other hand, the use
of explicit role indicators may make the algebra of the entire process
non-Boolean. However, once a role indicator system is designed, the
use of the indicators will add only 10 to 20% to the cost of indexing
and about the same amount to the size of the index (because about 10
to 20% of the descriptors assigned to each document will carry two roles).
The use of association links, on the other hand, will add 50 to 150%
to the cost of indexing and to the size of the index.
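
The arithmetic behind these percentages can be made concrete with a small Python sketch. The baseline figures (an index of 1,000,000 postings, a document with 20 descriptors) are assumptions chosen only for illustration, as are the function names; each percentage is taken relative to the unconstrained baseline.

```python
def with_overhead(base, pct):
    """Return base increased by pct percent (integer arithmetic)."""
    return base * (100 + pct) // 100


def postings_per_document(descriptors, two_role_pct):
    """One posting per descriptor-role pair.

    A descriptor carrying two roles is posted twice, so the growth in
    postings equals the fraction of two-role descriptors.
    """
    return descriptors + descriptors * two_role_pct // 100


BASE_POSTINGS = 1_000_000  # assumed size of an index without constraints

# Range of index sizes with role indicators (10 to 20% added).
ROLE_RANGE = tuple(with_overhead(BASE_POSTINGS, p) for p in (10, 20))

# Range of index sizes with association links (50 to 150% added).
LINK_RANGE = tuple(with_overhead(BASE_POSTINGS, p) for p in (50, 150))

# A document with 20 descriptors, 15% of which carry two roles.
EXAMPLE_POSTINGS = postings_per_document(20, 15)
```

Under these assumptions the index grows to between 1,100,000 and 1,200,000 postings with role indicators, and to between 1,500,000 and 2,500,000 with association links; the example document needs 23 postings instead of 20.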

Finally,  this  kind  of  structural  constraint,  while  specifically 
useful  for  certain  aspects  of  chemical  literature,  may  be  quite 
inoperable  in  retrieval  practice  for  many  reports  in  other  branches 
of  science  and  technology. 

IV.  IMPLICATIONS  OF  OUTLINED  PLANS  AND  GUIDELINES 


The implementation of the aforementioned plans implies certain
other actions which will result automatically. These are discussed
below.


A.  Multiword  Descriptors 

It is recognized that the inclusion in the Thesaurus of numerous
multiword descriptors tends to increase the number of "Also See" references,
to reduce the number of "Use" (and "Includes") references, and to have
little effect upon the number of "Generic To" (and "Add") references.
The net effect is to cause the Thesaurus to be physically larger, because
of the larger number of vocabulary descriptors generated by the various
word combinations, but even more so because of the proliferation of
"Also See" references (already the most numerous type of reference).

For this reason, during the revision of the Thesaurus, it is
expected that some of the descriptors (represented by single or multiple
words) will be "split" into descriptors of simpler (and broader) meaning.
However, "splitting" must be avoided if the meaning of any part of the
"split" descriptor is distorted from its meaning in the combined form
(e.g., half-life vs. half and life, and air-to-surface vs. air and surface)
or if the multiword descriptor is already heavily used. In this latter
case, "splitting" of the heavily used multiword descriptor may result
in the expenditure of excessive personnel and machine time to "recoordinate"
these descriptors when servicing inquiries.

It should be noted that any steps to minimize the appearance of
multiword descriptors in the Thesaurus need not prevent the operators
of retrieval machinery from maintaining their own "precoordinations"
of popular combinations of "split" terms for their own operational
convenience.
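
As an illustration of such an operator-maintained precoordination, the following Python sketch stores the result of coordinating a popular combination of split terms so that it need not be recomputed for each inquiry. The descriptors, document numbers, and class name are hypothetical, and the sketch assumes that the posting lists do not change while the stored results are in use.

```python
# Hypothetical posting lists for two "split" descriptors.
POSTINGS = {
    "RADAR": {"AD-1", "AD-2", "AD-5"},
    "ANTENNAS": {"AD-2", "AD-3", "AD-5"},
}


def coordinate(postings, terms):
    """Documents indexed under every one of the given descriptors."""
    return set.intersection(*(postings[t] for t in terms))


class Precoordinations:
    """Operator-side store of already coordinated term combinations."""

    def __init__(self, postings):
        self.postings = postings
        self.stored = {}

    def search(self, terms):
        key = frozenset(terms)
        if key not in self.stored:
            # First inquiry for this combination: coordinate once and keep it.
            self.stored[key] = coordinate(self.postings, key)
        return self.stored[key]


operator = Precoordinations(POSTINGS)
first = operator.search(["RADAR", "ANTENNAS"])
second = operator.search(["ANTENNAS", "RADAR"])  # served from the store
```

The Thesaurus itself carries only the split terms; the stored combination is purely an operational convenience and would have to be rebuilt whenever new documents are posted under either term.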

B. Descriptor Group Redesign

It is recognized that many changes are desirable in the design of
descriptor groups in order to implement the Type A descriptor concept.
Some groups will be eliminated via absorption into existing groups;
others will be eliminated by being split into newly defined groups;
group names will be modified. ASTIA is already active in this work;
however, it is expected that further modifications to descriptor
groupings will result from an overall examination of the results of
the revision of the descriptor Thesaurus. This would, of course, be
a task to be undertaken after the completion of the revision discussed
in this paper.

V. IMPLEMENTATION OF PLANS

A. Independent Activities of ASTIA

As recognized by the aforementioned ad hoc committee, ASTIA is
already proceeding with work leading to refinement of the Thesaurus.
Aside from descriptor group redesign (see above) and routine maintenance
work (addition of new descriptors, cross references, etc.), other
nonroutine activities are in progress and are described below.

1. Field Stabilization

This will involve the elimination of the present Field No. 13
(Miscellaneous Arts and Sciences) and the creation of a special
pseudo-field to contain all general Type B descriptors which are not
subject-matter equivalent to other Type B descriptors.

2. Low-Frequency Descriptor File

Low-frequency Type B descriptors are being evaluated for
deletion of obsolete descriptors, and a file of about 2,000 low-frequency
Type B descriptors is being created. This file will be used to perform
manual searches. Bibliographic searches involving these descriptors are
more quickly handled by hand than by machine.

3. High-Frequency Descriptor Refinement

ASTIA is investigating the descriptors with highest use
frequency to determine whether these descriptors should be more precisely
defined and whether the subject matter described by these descriptors can
be better described by new, more specific descriptors.

4. General Plans

ASTIA plans to make available the appropriate amounts and
quality of personnel and machine time to permit implementation of the
Thesaurus revision program. In fact, people and machines are already
active on the initial phases of this work.

B.  Plans  for  Cooperative  Activity 

Two major activities are planned in which ASTIA invites participation
by others. One of these is the development of an identifier Thesaurus,
a device to be created along the lines heretofore discussed. The other
is the revision of the existing Thesaurus of ASTIA Descriptors.

1. Identifier Thesaurus

This work can begin immediately. ASTIA invites active
participation by small groups working in series or in parallel at ASTIA.
The nature of this material permits groups with specialized interests
to participate, because this material can be broken into such subject
areas as chemistry, electronics, aeronautics, etc.

ASTIA believes that the construction of an adequate identifier
Thesaurus is as important as the construction of a descriptor Thesaurus,
and therefore rates this project as having 1-A priority.

2. Descriptor Thesaurus

The aforementioned ad hoc committee suggested that assistance
by others to ASTIA in this particular effort should be provided by a
small group of no more than about six people working at ASTIA with
appropriate ASTIA personnel. The composition of this group need not
be constant throughout the endeavor, but obviously the rotation into
the task force of new members should be spaced to preserve a maximum
continuity of experience.

The experience of ASTIA (and of others who have developed
thesauri) has been that implementation of the "small task force"
concept is both feasible and essential. Because of the great interest
expressed by several groups representing the scientific and industrial
communities, precedence will be given, in choosing the membership of
the task force, to organizations which are engaged in thesaurus
development. The task force will operate within the guidelines and principles
outlined in this paper insofar as sound judgment and experience dictate.

REPORT OF THE COMMITTEE FOR DEVELOPMENT OF CRITERIA
FOR GUIDELINES FOR THESAURUS REVISION


MEMBERS:

C. N. Mooers, Chairman
E. B. Hincks
B. E. Holm
J. V. Philbrick
P. H. Klingbiel
A. J. Neumann

FOUR  MAJOR  POINTS  ARE  SUBMITTED: 


I. Concurrence that ASTIA by 1 November 1961 prepare and circulate
a first draft document covering the matters set forth in Item 6
of a list of recommendations* adopted by a thesaurus evaluation
group which met 14-15 August 1961.

II.  Recommendation  that  the  following  be  considered  as  a  suggested 
manner  of  preparing  said  document: 

A. Deadline by 1 November 1961 for first draft.

B. The task force for the work should be members of ASTIA
staff plus any outside personnel specifically assigned to
the task or under contract.

C.  ASTIA  staff  should  be  released  for  a  preset  number  of  hours 
per  day,  and  given  non-distracting  quarters,  for  work  on 
this  draft. 

D. ASTIA computer time should be made available for any
quantitative assistance.

(Note:  This  Section  II  advisory  only.) 

III.  Concurrence  that  bringing  in  working  group  members  from  outside 
ASTIA  to  work  on  the  Thesaurus  revision  is  to  follow  completion 
of said document. (This is not to preclude ASTIA from
undertaking refining work immediately on the Thesaurus. It is merely
that  those  on  the  outside  want  definite  guidelines  to  follow 
before  they  can  beneficially  work  on  Thesaurus  refinement.) 

IV. Concurrence that the topics to be elaborated upon in this
document include points A and B below:


*Item  6  of  that  meeting  reads  as  follows:  Preparation  and  publication 
of  procedures,  criteria,  and  standards  for  entry  and  deletion  of 
retrieval  terms;  establish  procedures  for  notification  to  users  of 
Thesaurus changes on a periodic basis.

Atch 2


A. Develop criteria for different kinds of descriptors, their
use, change, updating, deletion, etc. These three kinds of
descriptors were discussed:

KIND A: Group Head Terms or their equivalent which may be
suitable for interlibrary compatibility.

KIND B: Descriptors of the general kind now used in ASTIA.

KIND C: Terms similar to ASTIA Identifiers.

ASTIA is to draw up criteria on their distinction.

B. Delineate how to handle:

1. Relationships and cross references

2. Hierarchies

3. Any other relationships

V. This Committee further concurred and advised as follows:

A. Descriptors of KIND B cannot be made completely compatible
between libraries or from system to system. (Compatibility
between systems will occur with KIND A primarily.)

B. The Thesaurus should be aimed specifically to be a tool for
the librarian in a documentation center. (Descriptors of
KIND A, however, should be usable by engineers for easy
assignment to papers sent to other organizations.)

C. Assistance to ASTIA in the revision of the Thesaurus must be
by a small outside group, such as 6 people or less.
(Membership of the outside participating group may be rotating.)

D. The presumption must be made that machines will be more than
capable of handling descriptor retrieval manipulations.

E. Terminology such as "Descriptor," "Keyword," etc. must be
precisely defined in the draft document and so used.

F. Support should be provided (machine time, programming) to
utilize data already available such as descriptor frequencies,
and the like.


CALVIN N. MOOERS

Chairman  18 October 1961

DESCRIPTOR PROPOSAL


1. Descriptor:          Group No.          Date:

2. Proposed cross references:
   Incl:          Also See:          Submitted by:
   Coordination:

3. Proposed definition:

4. Authority (literature citations and references to AD numbers):

5. Information on this subject presently contained in the AD collection
might be retrieved by: